GNU Info File | 1995-09-01 | 48.5 KB | 1,197 lines
This is Info file ../info/w3.info, produced by Makeinfo-1.63 from the
input file w3.texi.
This file documents the Emacs-w3 World Wide Web browser.
Copyright (C) 1993, 1994, 1995 William M. Perry
Permission is granted to make and distribute verbatim copies of this
manual provided the copyright notice and this permission notice are
preserved on all copies.
File: w3.info, Node: Mailcap File, Prev: Specifying Viewers, Up: MIME Support
Mailcap File
============
NCSA Mosaic and almost all other WWW browsers rely on a separate file
for mapping MIME types to external viewing programs. This takes some of
the burden off of browser developers, so each browser does not have to
support all image formats, or postscript, etc. Instead of having the
users of Emacs-w3 duplicate this in lisp, this file can be parsed using
the `mm-parse-mailcaps' function. This function is called each time w3
is loaded. It tries to locate mailcap files in several places. If the
environment variable `MAILCAPS' is nonempty, then this is assumed to
specify a UNIX-like path of mailcap files (this is a colon separated
string of pathnames). If the `MAILCAPS' environment variable is empty,
then Emacs-w3 looks for these files:
1. `~/.mailcap'
2. `/etc/mailcap'
3. `/usr/etc/mailcap'
4. `/usr/local/etc/mailcap'
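The lookup order described above can be sketched in a few lines of lisp.
This is only an illustration, not the code that `mm-parse-mailcaps'
actually runs. The name `my-mailcap-candidates' is made up for this
example, and the sketch assumes a recent Emacs in which `split-string'
accepts an OMIT-NULLS argument.

```elisp
;; Hypothetical helper: not part of Emacs-w3.
(defun my-mailcap-candidates ()
  "Return the list of mailcap files that would be consulted, in order."
  (let ((path (getenv "MAILCAPS")))
    (if (and path (not (string= path "")))
        ;; MAILCAPS is a colon separated string of pathnames.
        (split-string path ":" t)
      ;; Otherwise fall back to the four default locations.
      (list "~/.mailcap"
            "/etc/mailcap"
            "/usr/etc/mailcap"
            "/usr/local/etc/mailcap"))))
```

Calling `(my-mailcap-candidates)' with `MAILCAPS' unset should simply
return the four default names listed above.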
The format of this file is specified in RFC 1343, but a brief
synopsis follows (this is taken verbatim from sections of RFC 1343).
Each mailcap file consists of a set of entries that describe the
proper handling of one media type at the local site. For example, one
line might tell how to display a message in Group III fax format. A
mailcap file consists of a sequence of such individual entries,
separated by newlines (according to the operating system's newline
conventions). Blank lines and lines that start with the "#" character
(ASCII 35) are considered comments, and are ignored. Long entries may
be continued on multiple lines if each non-terminal line ends with a
backslash character ('\', ASCII 92), in which case the multiple lines
are to be treated as a single mailcap entry. Note that for such
"continued" lines, the backslash must be the last character on the line
to be continued.
Each mailcap entry consists of a number of fields, separated by
semi-colons. The first two fields are required, and must occur in the
specified order. The remaining fields are optional, and may appear in
any order.
The first field is the content-type, which indicates the type of data
this mailcap entry describes how to handle. It is to be matched against
the type/subtype specification in the "Content-Type" header field of an
Internet mail message. If the subtype is specified as "*", it is
intended to match all subtypes of the named content-type.
The second field, view-command, is a specification of how the
message or body part can be viewed at the local site. Although the
syntax of this field is fully specified, the semantics of program
execution are necessarily somewhat operating system dependent.
The optional fields, which may be given in any order, are as follows:
* The "compose" field may be used to specify a program that can be
used to compose a new body or body part in the given format. Its
intended use is to support mail composing agents that support the
composition of multiple types of mail using external composing
agents. As with the view-command, the semantics of program
execution are operating system dependent. The result of the
composing program may be data that is not yet suitable for mail
transport--that is, a Content-Transfer-Encoding may need to be
applied to the data.
* The "composetyped" field is similar to the "compose" field, but is
to be used when the composing program needs to specify the
Content-type header field to be applied to the composed data. The
"compose" field is simpler, and is preferred for use with existing
(non-mail-oriented) programs for composing data in a given format.
The "composetyped" field is necessary when the Content-type
information must include auxiliary parameters, and the
composition program must then know enough about mail formats to
produce output that includes the mail type information.
* The "edit" field may be used to specify a program that can be used
to edit a body or body part in the given format. In many cases,
it may be identical in content to the "compose" field, and shares
the operating-system dependent semantics for program execution.
* The "print" field may be used to specify a program that can be
used to print a message or body part in the given format. As with
the view-command, the semantics of program execution are operating
system dependent.
* The "test" field may be used to test some external condition (e.g.
the machine architecture, or the window system in use) to
determine whether or not the mailcap line applies. It specifies a
program to be run to test some condition. The semantics of
execution and of the value returned by the test program are
operating system dependent. If the test fails, a subsequent
mailcap entry should be sought. Multiple test fields are not
permitted--since a test can call a program, it can already be
arbitrarily complex.
* The "needsterminal" field indicates that the view-command must be
run on an interactive terminal. This is needed to inform
window-oriented user agents that an interactive terminal is
needed. (The decision is not left exclusively to the view-command
because in some circumstances it may not be possible for such
programs to tell whether or not they are on interactive
terminals.) The needsterminal command should be assumed to apply
to the compose and edit commands, too, if they exist. Note that
this is NOT a test--it is a requirement for the environment in
which the program will be executed, and should typically cause the
creation of a terminal window when not executed on either a real
terminal or a terminal window.
* The "copiousoutput" field indicates that the output from the
view-command will be an extended stream of output, and is to be
interpreted as advice to the UA (User Agent mail-reading program)
that the output should be either paged or made scrollable. Note
that it is probably a mistake if needsterminal and copiousoutput
are both specified.
* The "description" field simply provides a textual description,
optionally quoted, that describes the type of data, to be used
optionally by mail readers that wish to describe the data before
offering to display it.
* The "x11-bitmap" field names a file, in X11 bitmap (xbm) format,
which points to an appropriate icon to be used to visually denote
the presence of this kind of data.
* Any other fields beginning with "x-" may be included for local or
mailer-specific extensions of this format. Implementations should
simply ignore all such unrecognized fields to permit such
extensions, some of which might be standardized in a future
version of this document.
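To make the field layout concrete, here is a rough sketch of how a
single mailcap entry could be broken into its parts. It is not the
parser used by Emacs-w3. The function name `my-mailcap-parse-entry' and
the sample entry are invented for illustration, and the sketch
deliberately ignores backslash-escaped semicolons and continuation
lines. It assumes an Emacs whose `split-string' accepts a TRIM argument.

```elisp
;; Hypothetical helper: a simplified view of one mailcap entry.
(defun my-mailcap-parse-entry (entry)
  "Split mailcap ENTRY into content-type, view-command and optional fields.
Escaped semicolons (\\;) are NOT handled in this sketch."
  (let ((fields (split-string entry ";" nil "[ \t]+")))
    (list :type (car fields)        ; first required field
          :view (cadr fields)       ; second required field
          :options (cddr fields)))) ; optional fields, any order

;; A made-up entry: view any image with xv, but only under X11.
(defvar my-mailcap-sample
  "image/*; xv %s; test=test -n \"$DISPLAY\"; description=\"Any image\"")
```

Evaluating `(my-mailcap-parse-entry my-mailcap-sample)' gives a property
list whose `:options' element holds the "test" and "description" fields
as plain strings.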
File: w3.info, Node: Security, Next: Non-Unix Operating Systems, Up: Top
Security
********
* Menu:
* Basic:: The 'Basic' authentication scheme for HTTP/1.0
* Digest:: The 'Digest' authentication scheme for HTTP/1.0
* SSL:: Secure Sockets Layer from Netscape, and how to
enable it in Emacs-w3.
* PGP/PEM:: Using PGP/PEM to encrypt information
There are an increasing number of ways to authenticate yourself to a
web service. Emacs-w3 tries to support as many as possible.
File: w3.info, Node: Basic, Next: Digest, Prev: Security, Up: Security
HTTP/1.0 Basic Authentication
=============================
Basic is the weakest authentication scheme available, and is not
recommended if you are at all serious about security on your web site.
The credential is simply a string of the form `user:password' that has
been Base64 encoded, as defined in RFC 1421. It is given as an example
of how to write an authorization module. Storing, retrieving, and
over-writing the cached authorization information should all be
handled by one function (although it would be perfectly acceptable to
have a stub function that passed off to three larger functions based on
its parameters). The most efficient way to store the cached
information is by an assoc-list of assoc-lists. The top level assoc
list is keyed on the name of the server. The secondary assoc-list is
keyed on the full path of the file that is protected. Thus, a sample
authorization cache would look like this:
     (("info.cern.ch"   . (("/foo"        . "d21wZXJyeTp0ZXN0aW5n")
                           ("/bar"        . "amtvbnJhdGg6ZGlzbWVtYmVy")
                           ("/foo/x.html" . "dmlvbGV0dDpvcGVuZ2w=")))
      ("cs.indiana.edu" . (("/elisp/w3/"  . "dGxvb3M6Y29ucXVlcg==")
                           ("/"           . "bXZhbmhleW46a2lsbGh1bGljaw=="))))
The structure consists of two assoc-lists for the sake of speed. The
list of cached information could conceivably hold several thousand links
(if the user does not exit Emacs for long periods of time). If the list
were keyed on a full URL, the assoc function would have to search
through every link before failing to find a new URL. With the current
scheme, assoc only has to search through a few items (at most the
number of HTTP servers, which should always be much, much smaller than
the number of distinct URLs). Even with a 3:1 ratio of URLs to each
server, this is a big win.
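The two-level cache can be sketched as follows. This is an
illustration only, not the authorization module shipped with Emacs-w3.
The names `my-auth-cache', `my-auth-store' and `my-auth-lookup' are
invented, and the sketch assumes a modern Emacs that provides
`base64-encode-string'.

```elisp
;; Hypothetical cache: ((SERVER . ((PATH . CREDENTIAL) ...)) ...)
(defvar my-auth-cache nil
  "Assoc-list keyed on server name; values are assoc-lists keyed on path.")

(defun my-auth-store (server path user password)
  "Cache the Base64 encoded `USER:PASSWORD' for PATH on SERVER."
  (let ((cred (base64-encode-string (concat user ":" password) t))
        (entry (assoc server my-auth-cache)))
    (if entry
        ;; Known server: add the path to its secondary assoc-list.
        (setcdr entry (cons (cons path cred) (cdr entry)))
      ;; New server: start a fresh secondary assoc-list.
      (push (list server (cons path cred)) my-auth-cache))
    cred))

(defun my-auth-lookup (server path)
  "Return the cached credential for PATH on SERVER, or nil."
  ;; First search the (short) list of servers, then that server's paths.
  (cdr (assoc path (cdr (assoc server my-auth-cache)))))
```

A failed lookup for an unknown server stops after scanning only the
top-level list, which is the speed advantage argued for above.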
File: w3.info, Node: Digest, Next: SSL, Prev: Basic, Up: Security
HTTP/1.0 Digest Authentication
==============================
Jeffery L. Hostetler, John Franks, Philip Hallam-Baker, Ari Luotonen,
Eric W. Sink, and Lawrence C. Stewart have an internet draft for a new
authentication mechanism. For the complete specification, please see
draft-ietf-http-digest-aa-01.txt in your nearest internet drafts
archive(1). What follows is mainly taken from the March 24, 1995
version of the internet draft.
The protocol referred to as "HTTP/1.0" includes specification for a
Basic Access Authentication scheme. This scheme is not considered to
be a secure method of user authentication, as the user name and password
are passed over the network in an unencrypted form. A specification for
a new authentication scheme is needed for future versions of the HTTP
protocol. This document provides specification for such a scheme,
referred to as "Digest Access Authentication".
The Digest Access Authentication scheme is not intended to be a
complete answer to the need for security in the World Wide Web. This
scheme provides no encryption of object content. The intent is simply
to facilitate secure access authentication.
Like Basic Access Authentication, the Digest scheme is based on a
simple challenge-response paradigm. The Digest scheme challenges using
a nonce value. A valid response contains the MD5 checksum of the
password and the given nonce value. In this way, the password is never
sent in the clear. Just as with the Basic scheme, the username and
password must be prearranged in some fashion.
If a server receives a request for an access-protected object, and an
acceptable Authorization header is not sent, the server responds with:
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Digest realm="<realm>",
domain="<domain>",
nonce="<nonce>",
opaque="<opaque>",
stale="<TRUE | FALSE>"
The meanings of the identifiers used above are as follows:
`<realm>'
A name given to users so they know which username and password to
send.
`<domain> OPTIONAL'
A comma separated list of URIs, as specified for HTTP/1.0. The
intent is that the client could use this information to know the
set of URIs for which the same authentication information should
be sent. The URIs in this list may exist on different servers.
If this keyword is omitted or empty, the client should assume that
the domain consists of all URIs on the responding server.
`<nonce>'
A server-specified integer value which may be uniquely generated
each time a 401 response is made. Servers may defend themselves
against replay attacks by refusing to reuse nonce values. The
nonce should be considered opaque by the client.
`<opaque> OPTIONAL'
A string of data, specified by the server, which should be returned
by the client unchanged. It is recommended that this string be
base64 or hexadecimal data. Specifically, since the string is
passed in the header lines as a quoted string, the double-quote
character is not allowed.
`<stale> OPTIONAL'
A flag, indicating that the previous request from the client was
rejected because the nonce value was stale. If stale is TRUE, the
client may wish to simply retry the request with a new encrypted
response, without reprompting the user for a new username and
password.
The client is expected to retry the request, passing an Authorization
header line as follows:
Authorization: Digest
username="<username>", -- required
realm="<realm>", -- required
nonce="<nonce>", -- required
uri="<requested-uri>", -- required
response="<digest>", -- required
message="<message-digest>", -- OPTIONAL
opaque="<opaque>" -- required if provided by server
where <digest> := H( H(A1) + ":" + N + ":" + H(A2) )
and <message-digest> := H( H(A1) + ":" + N + ":" + H(<message-body>) )
where:
A1 := U + ':' + R + ':' + P
A2 := <Method> + ':' + <requested-uri>
with:
N -- nonce value
U -- username
R -- realm
P -- password
<Method> -- from header line 0
<requested-uri> -- uri sans proxy/routing
Where H() is the RSA Data Security, Inc. MD5 Message-Digest Algorithm
(2).
Upon receiving the Authorization information, the server may check
its validity by looking up its known password which corresponds to the
submitted <username>. Then, the server must perform the same MD5
operation performed by the client, and compare the result to the given
<response>.
Note that the HTTP server does not actually need to know the user's
clear text password. As long as H(A1) is available to the server, the
validity of an Authorization header may be verified.
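The digest computation above can be written out as a short sketch. This
is not the code used inside Emacs-w3. The function names are invented
for this example, the sample values are made up, and the sketch assumes
an Emacs with the built-in `md5' function. It follows the formulas given
above, with H() being MD5 and "+" being string concatenation.

```elisp
;; Hypothetical helpers illustrating the <digest> formula.
(defun my-digest-h (&rest parts)
  "Return the MD5 hex digest of PARTS joined with colons."
  (md5 (mapconcat #'identity parts ":")))

(defun my-digest-response (username realm password nonce method uri)
  "Compute <digest> := H( H(A1) : N : H(A2) ) on the client side."
  (let ((h-a1 (my-digest-h username realm password)) ; A1 = U:R:P
        (h-a2 (my-digest-h method uri)))             ; A2 = Method:uri
    (my-digest-h h-a1 nonce h-a2)))

(defun my-digest-valid-p (stored-h-a1 nonce method uri response)
  "Server-side check that needs only H(A1), not the clear text password."
  (string= response
           (my-digest-h stored-h-a1 nonce (my-digest-h method uri))))

;; Example with made-up values:
;; (my-digest-response "wmperry" "w3-test" "testing" "12345"
;;                     "GET" "/docs/protected.html")
```

The second function shows why the server can keep only H(A1) on disk:
the password itself never appears in the verification step.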
All keyword-value pairs must be expressed in characters from the
US-ASCII character set, excluding control characters.
---------- Footnotes ----------
(1) One is ftp://ds.internic.net/internet-drafts
(2) RFC 1321. R.Rivest, "The MD5 Message-Digest Algorithm",
http://ds.internic.net/rfc/rfc1321.txt, April 1992.
File: w3.info, Node: SSL, Next: PGP/PEM, Prev: Digest, Up: Security
SSL
===
SSL is the `Secure Sockets Layer' interface developed by Netscape
Communications (1).
In order to use SSL in Emacs-w3, you will need one of the reference
implementations of SSL that are publicly available. These are the
implementations that I am aware of:
`SSLRef 2.0'
Available from Netscape Communications at
http://www.netscape.com/newsref/std/sslref.html. This requires the
RSARef library, which is not exportable. The RSARef library is
available from ftp://ftp.rsa.com/rsaref/
`SSLeay 0.4'
An implementation by Eric Young (eay@mincom.oz.au) that is free for
commercial or noncommercial use, and was developed completely
outside the US by a non-US citizen. More information can be found
at ftp://ftp.psy.uq.oz.au/pub/Crypto/SSL/
Whichever reference implementation you choose to download (I
recommend the SSLeay distribution, just to thumb a nose at the NSA :),
you must have a program you can run in a subprocess that takes a
hostname and port number on the command line, and reads/writes to
standard input/output (the Netscape implementation comes with one of
these by default). Once you have this program, set the variable
`ssl-program-name' to point to the executable.
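For example, something like the following in your `.emacs' file would
do it. The path shown is purely hypothetical; substitute wherever your
SSL client program actually lives.

```elisp
;; Hypothetical location of the SSL helper program.
(setq ssl-program-name "/usr/local/bin/ssl-client")

;; Optional sanity check: warn if the program is missing.
(unless (file-executable-p ssl-program-name)
  (message "Warning: %s is not executable" ssl-program-name))
```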
This should be all you need to do. In the future, I will be
distributing a set of patches to Emacs 19.xx and XEmacs 19.xx to
SSL-enable them, for the sake of speed.
NOTE: This implementation does not support the use of client
certificates, but then nobody else supports that area of the protocol
either, so I'm not too worried about it.
---------- Footnotes ----------
(1) http://www.netscape.com/
File: w3.info, Node: PGP/PEM, Prev: SSL, Up: Security
PGP/PEM
=======
Most of this section was taken from the documentation written by Rob
McCool <robm@ncsa.uiuc.edu>, and is gratefully reproduced here with his
permission (1).
RIPEM is 'Riordan's Internet Privacy Enhanced Mail', and is
currently on version 1.2b3. US citizens can ftp it from
ripem.msu.edu:/pub/crypt/ripem.
PGP is 'Pretty Good Privacy', and is currently on version 2.6. The
legal controversies that plagued earlier versions have been resolved, so
this is a completely legal program now. There is also a legal version
for European users, called 2.6ui (the Unofficial International version).
PGP and PEM are programs that allow you and a second party to
exchange messages in a way that prevents third parties from reading
them, and that certifies that the person who sent a message is really
who they claim to be.
PGP and PEM both use RSA encryption. The U.S. government has strict
export controls over foreign use of this technology, so people outside
the U.S. may have a difficult time finding programs which perform the
encryption.
You will need a working copy of either Pretty Good Privacy or RIPEM
to begin with. You should be familiar with the program and have
generated your own public/private key pair. You should be able to use
the TIS/PEM program with the PEM authorization type. I haven't tried
it. This tutorial is written assuming that you are using RIPEM.
Currently, the protocol has been implemented with PEM and PGP using
local key files on the server side, and on the client side with PEM
using finger to retrieve the server's public key.
As you can tell, parties who wish to use Emacs-w3 and httpd with PEM
or PGP encryption will need to communicate beforehand and find a
tamper-proof way to exchange their public keys.
Pioneers get shot full of arrows. This work is currently in the
experimental stages and thus may have some problems that I have
overlooked. The only problem that I know about is that the
messages are currently not timestamped. This means that a malicious
user could record your encrypted message with a packet sniffer and
repeat it back to the server ad nauseam. Although they would not be
able to read the reply, if the request was something you were being
charged for, you may have a large bill to pay by the time they're
through.
This protocol is almost word-for-word a copy of Tony Sanders' RIPEM
based scheme, generalized a little. Below, wherever you see PEM you can
replace it with PGP and get the same thing.
*Client:*
GET /docs/protected.html HTTP/1.0
User-Agent: Emacs-W3/2.1.x
*Server:*
HTTP/1.0 401 Unauthorized
WWW-Authenticate: PEM entity="webmaster@hoohoo.ncsa.uiuc.edu"
Server: NCSA/1.1
*Client:*
GET / HTTP/1.0
Authorization: PEM entity="robm@ncsa.uiuc.edu"
Content-type: application/x-www-pem-request
--- BEGIN PRIVACY-ENHANCED MESSAGE ---
this is the real request, encrypted
--- END PRIVACY-ENHANCED MESSAGE ---
*Server:*
HTTP/1.0 200 OK
Content-type: application/x-www-pem-reply
--- BEGIN PRIVACY-ENHANCED MESSAGE ---
this is the real reply, encrypted
--- END PRIVACY-ENHANCED MESSAGE ---
That's it.
Emacs-w3 uses the excellent mailcrypt package written by Jin S Choi
<jsc@mit.edu> (2). This package takes care of calling ripem and/or pgp
with the correct arguments. Please see the documentation at the top of
mailcrypt.el for instructions on using mailcrypt. All bug reports
about mailcrypt should go to Jin S Choi, but bugs about how I use it in
Emacs-w3 should of course be directed to me.
---------- Footnotes ----------
(1) See http://hoohoo.ncsa.uiuc.edu/docs/PEMPGP.html
(2) Available via anonymous ftp to archive.cis.ohio-state.edu in
/pub/gnu/emacs/elisp-archive/interfaces/mailcrypt.el.Z
File: w3.info, Node: Non-Unix Operating Systems, Next: VMS, Prev: Security, Up: Top
Non-Unix Operating Systems
**************************
* Menu:
* VMS:: The wonderful world of VAX|AXP-VMS!
* OS/2:: The next-best thing to Unix.
* MS-DOS:: The wonderful world of MS-DOG!
* 16-Bit Windows:: Windows 3.1, 3.11, and WFW 3.11.
* 32-Bit Windows:: Windows NT, Chicago/Windows 95.
* Macintosh:: The wonderful world of Macintrash!
* Amiga:: The Amiga, for those who still love them.
File: w3.info, Node: VMS, Next: OS/2, Prev: Non-Unix Operating Systems, Up: Non-Unix Operating Systems
VMS
===
:: WORK ::
File: w3.info, Node: OS/2, Next: MS-DOS, Prev: VMS, Up: Non-Unix Operating Systems
OS/2
====
:: WORK ::
File: w3.info, Node: MS-DOS, Next: 16-Bit Windows, Prev: OS/2, Up: Non-Unix Operating Systems
MS-DOS
======
:: WORK ::
File: w3.info, Node: 16-Bit Windows, Next: 32-Bit Windows, Prev: MS-DOS, Up: Non-Unix Operating Systems
16-Bit Windows
==============
:: WORK ::
File: w3.info, Node: 32-Bit Windows, Next: Macintosh, Prev: 16-Bit Windows, Up: Non-Unix Operating Systems
32-Bit Windows
==============
:: WORK ::
File: w3.info, Node: Macintosh, Next: Amiga, Prev: 32-Bit Windows, Up: Non-Unix Operating Systems
Macintosh
=========
:: WORK ::
File: w3.info, Node: Amiga, Next: Advanced Features, Prev: Macintosh, Up: Non-Unix Operating Systems
Amiga
=====
:: WORK ::
File: w3.info, Node: Advanced Features, Next: Style Sheets, Prev: Amiga, Up: Top
Advanced Features
*****************
* Menu:
* Style Sheets:: Formatting control, the right way
* Disk Caching:: Speeding performance by using a local disk cache
* Searching:: How to search entire sections of the web
* Interfacing to Mail/News:: How to make VM understand hypertext links
* Debugging HTML:: How to make Emacs-w3 display warnings about invalid
HTML/HTML+ constructs.
* Native WAIS Support:: How to make Emacs-w3 understand WAIS links without
using a gateway.
* Rating Links:: How to make Emacs-w3 put an 'interestingness' value
next to each link.
* Gopher Plus Support:: How Emacs-w3 makes use of the Gopher+ protocol.
* Hooks:: Various hooks to use throughout Emacs-w3
* Other Variables:: Miscellaneous variables that control the real
guts of Emacs-w3.
File: w3.info, Node: Style Sheets, Next: Disk Caching, Prev: Advanced Features, Up: Advanced Features
Style Sheets
============
Emacs-w3 currently supports the experimental style sheet mechanism
proposed by Håkon W. Lie of the W3 Consortium. This allows for the
author to specify what a document should look like, and yet allow the
end user to override any of the stylistic changes. This allows for
people with special needs (most notably the visually impaired) to
override style bindings that could make a document totally unreadable.
A stylesheet consists of comments and directives. A comment is any
line starting with a #, and is terminated by the end of the line. A
directive includes the tag name, an attribute name, and a value. A
sample stylesheet is:
<style notation="experimental">
# This line is a comment
# These will be ignored, up to the terminating end-of-line
#
h1: align=center
h1: color.text=yellow
h1: color.background=red
h1: font.size *= 2
</style>
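As a rough illustration of the directive syntax, the following sketch
pulls one stylesheet line apart into tag, attribute, operator and value.
It is not the parser used by Emacs-w3. The function name
`my-style-parse-directive' is invented, and the regular expression only
covers the simple forms shown in the sample above.

```elisp
;; Hypothetical helper: parse one "tag: attribute op value" line.
(defun my-style-parse-directive (line)
  "Return (TAG ATTRIBUTE OPERATOR VALUE) for LINE, or nil.
Comment lines (starting with #) and anything unrecognized yield nil."
  (when (string-match
         (concat "\\`\\([a-z0-9]+\\):[ \t]*"  ; tag name and colon
                 "\\([a-z.]+\\)[ \t]*"        ; attribute name
                 "\\([-+*/]?=\\)[ \t]*"       ; =, +=, -=, *= or /=
                 "\\(.*\\)\\'")               ; value
         line)
    (list (match-string 1 line) (match-string 2 line)
          (match-string 3 line) (match-string 4 line))))
```

With this sketch, the line `h1: font.size *= 2' is split into the tag
`h1', the attribute `font.size', the operator `*=' and the value `2'.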
Below is a comprehensive list of the attribute names.
`color.text'
Specifies the foreground color of the text for this item.
`color.background'
Specifies the background color of the text for this item.
`background.bitmap'
Specifies a bitmap to be used as the background for this item.
`font.size'
Specifies the font size. This can be specified with the +=, -=,
/=, or *= operator, signifying a change from the default font
size. For example, font.size *= 2 would mean a font twice as
large as the default font.
`font.style'
Specifies the font style. This controls whether a font is bold,
italic, underlined, or any combination of these. The value can be
a comma or ampersand (&) separated list of values.
`font.family'
Specifies the font family - this is the basic type of font. Note
that not all font families will be available on all platforms, or
even the same platform in a slightly different configuration. If
the specified font family cannot be found on the machine, the
default font is used instead.
`align'
Specifies how the text contained within the item is to be aligned.
Possible values are left, right, justify, center, or indent.
`width'
Specifies how wide the item should be. This is only used for
horizontal rule (<HR>) tags right now.
To include a stylesheet into your document, simply use the <style>
tag. You can use the notation attribute to specify what language the
stylesheet is specified in. The default is experimental. The data
between the <style> and </style> tags is the stylesheet proper - no HTML
parsing is done to this data - it is treated similarly to an <XMP> section
of text. To reference an external stylesheet, you should use the <link>
tag.
<link rel="stylesheet" href="/bill.style">
If these two mechanisms are mixed, then the URL is resolved first,
and the contents of the <style> tag take precedence if there are any
conflicting directives.
In the future, DSSSL and DSSSL-lite will be supported as valid
stylesheet languages, but not in this release.
File: w3.info, Node: Disk Caching, Next: Searching, Prev: Style Sheets, Up: Advanced Features
Disk Caching
============
A cache stores the information on a page on your local machine. When
requesting a page that is in the cache, Emacs-w3 can retrieve the page
from the cache more quickly than retrieving the page again from its
location out on the network. With a well-populated cache, the speed of
browsing the web is dramatically increased.
The first time a page is requested, Emacs-w3 retrieves the page from
the network. When requesting a page that is in the cache, Emacs-w3
checks to see if the page has changed since it was last retrieved from
the remote machine. If it has not changed, the local copy is used,
saving the transmission of the file over the network.
To turn on disk caching, set the variable `url-automatic-caching' to
non-`nil', or choose the 'Caching' menu item (under `Options'). That
is all there is to it. It is recommended that you use the
`clean-cache' shell script first, to allow for future cleaning of the
cache. This shell script will remove all files that have not been
accessed since it was last run. To keep the cache pared down, it is
recommended that this script be run from at or cron (see the manual
pages for crontab(5) or at(1) for more information).
A large cache of documents on the local disk can be very handy when
traveling, or at any other time the network connection is not active
(a laptop with a dial-on-demand PPP connection, etc). Emacs-w3
can rely solely on its cache, and avoid checking to see if the page has
changed on the remote server. In the case of a dial-on-demand PPP
connection, this will keep the phone line free as long as possible,
only bringing up the PPP connection when asking for a page that is not
located in the cache. This is very useful for demonstrations as well.
To turn this feature on, set the variable `url-standalone-mode' to
non-`nil', or choose the `Use Cache Only' menu item (under `Options').
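In a `.emacs' file, the two settings described above might look like
this. It is only a sketch of one possible configuration; you would
normally enable the second variable only while you are off the network.

```elisp
;; Cache every page that is retrieved.
(setq url-automatic-caching t)

;; Rely solely on the cache; do not contact remote servers to check
;; whether a cached page has changed.
(setq url-standalone-mode t)
```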
Emacs-w3 caches files under the temporary directory specified by
`url-temporary-directory', in a user-specific subdirectory (determined
by the `user-real-login-name' function). The cache files are stored
under their original names, so a URL like
http://www.spry.com/foo/bar/baz.html would be stored in a cache file
named /tmp/wmperry/com/spry/www/foo/bar/baz.html. Sometimes, especially
with gopher links, there will be name conflicts, and an error will be
signalled. This cannot be avoided without giving up reasonable
performance at startup time (reading in an index file of all the cached
pages can take a long time on slow machines, or even fast machines with
large caches). If you are running XEmacs 19.12 or later, you can use an
alternate naming scheme that avoids name conflicts, but loses the human
readability of the cache file names. The cache files will look like:
/tmp/wmperry/acbd18db4cc2f85cedef654fccc4a4d8, which is certainly
unique, but not very user-friendly. To turn this on, add this to your
`.emacs' file:
     (add-hook 'w3-load-hooks
               (function
                (lambda ()
                  (fset 'url-create-cached-filename
                        'url-create-cached-filename-using-md5))))
If you will not be using other emacs variants, I highly recommend
this method of creating the cache filename.
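The two naming schemes can be illustrated with a small sketch. These
are not the real `url-create-cached-filename' functions. The names
`my-cache-filename' and `my-cache-filename-md5' are invented, the
handling of ports and odd URLs is ignored, and the sketch assumes an
Emacs with the built-in `md5' function.

```elisp
;; Hypothetical helpers mirroring the two naming schemes.
(defun my-cache-filename (url tmpdir user)
  "Map URL to a human readable cache file under TMPDIR/USER.
The host name is reversed, so www.spry.com becomes com/spry/www."
  (when (string-match "\\`[a-z]+://\\([^/:]+\\)\\(/.*\\)?\\'" url)
    (let* ((host (match-string 1 url))
           (path (or (match-string 2 url) "/"))
           (reversed (mapconcat #'identity
                                (reverse (split-string host "\\."))
                                "/")))
      (concat (file-name-as-directory tmpdir) user "/" reversed path))))

(defun my-cache-filename-md5 (url tmpdir user)
  "Map URL to a cache file named by the MD5 digest of the whole URL."
  (concat (file-name-as-directory tmpdir) user "/" (md5 url)))
```

Two different URLs can collapse onto conflicting names in the first
scheme (a name used both as a file and as a directory, for instance),
while the second scheme gives each URL its own flat file name.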
File: w3.info, Node: Searching, Next: Interfacing to Mail/News, Prev: Disk Caching, Up: Advanced Features
Searching
=========
In the file `w3-search.el' is a function that some may find handy.
It is not 100% completed yet, so if you run into any problems with it,
please try to fix it, not just say it's broken.
The function is `w3-do-search'. It must be called with at least one
argument. All others are optional. Arguments are TERM, BASE,
HOPS-LIMIT, and RESTRICTION. The function recursively descends all the
child links of the current document, searching for TERM. TERM may be a
string, in which case it is treated as a regular expression and
`re-search-forward' is used, or a symbol, in which case it is funcalled
with one argument, the current URL being searched.
BASE is the URL to start searching from.
HOPS-LIMIT is the maximum number of nodes to descend before the
search dies out.
RESTRICTION is a regular expression or function to call with one
argument, a URL that could be searched. If RESTRICTION returns
non-`nil', then the URL is added to the queue, otherwise it is
discarded. This is useful for restricting searching to either certain
types of URLs (only search ftp links), or restricting searching to one
domain (only search stuff in the indiana.edu domain).
You may check several variables from the main `w3-do-search' routine
in any functions passed to it (as RESTRICTION or TERM). QUEUE is the
queue of links to be searched, HOPS is the current number of hops from
the root document, RESULTS is an assoc list of (URL . RETVAL), where
RETVAL is the value returned from previous calls to the TERM function
(or point if searching for a regular expression).
The function returns a list of the form: ((URL . RETVAL)...)
Please note that there is no interactive use for this function
yet--it was designed for non-interactive, batch-mode processing.
However, if anyone wants to write a wrapper function for it, please feel
free.
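As a starting point, a call might be wrapped up like this. The sketch
follows the argument order given above (TERM, BASE, HOPS-LIMIT,
RESTRICTION); the names `my-ftp-url-p' and `my-search-ftp-tree', the
search term and the hop limit are all invented for illustration, and
nothing here has been checked against a particular release of
`w3-search.el'. It assumes an Emacs that provides `string-match-p'.

```elisp
;; Hypothetical RESTRICTION predicate: only follow ftp links.
(defun my-ftp-url-p (url)
  "Return non-nil if URL is an ftp link."
  (string-match-p "\\`ftp://" url))

;; Hypothetical wrapper; it is defined here but not called.
(defun my-search-ftp-tree (base)
  "Search up to 2 hops of ftp links below BASE for the word mailcap.
Returns the ((URL . RETVAL)...) list produced by `w3-do-search'."
  (w3-do-search "mailcap" base 2 'my-ftp-url-p))
```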
File: w3.info, Node: Interfacing to Mail/News, Next: Debugging HTML, Prev: Searching, Up: Advanced Features
Interfacing to Mail/News
========================
More and more people are including URLs in their signatures, and
within the body of mail messages. It can get quite tedious to type
these into the minibuffer to follow one.
To access URLs with VM, the following in your `~/.emacs' or `~/.vm'
files should do the trick. It adds two keybindings to the main VM
message window. The middle mouse button now tries to follow a
hypertext link.
     (add-hook 'vm-mode-hook
               (function
                (lambda ()
                  (define-key vm-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
                  (define-key vm-mode-map "\r" 'w3-maybe-follow-link))))
To access URLs with RMAIL, the following in your `~/.emacs' file
should do the trick.
     (add-hook 'rmail-mode-hook
               (function
                (lambda ()
                  (define-key rmail-mode-map [mouse-2] 'w3-maybe-follow-link-mouse)
                  (define-key rmail-mode-map "\r" 'w3-maybe-follow-link))))
To access URLs with GNUS, the following in your `~/.emacs' file
should do the trick.
     (add-hook 'gnus-article-mode-hook
               (function
                (lambda ()
                  (define-key gnus-article-mode-map [mouse-2]
                    'w3-maybe-follow-link-mouse)
                  (define-key gnus-article-mode-map "\r"
                    'w3-maybe-follow-link))))
NOTE: XEmacs 19.12 has a special version of VM and GNUS that does the
highlighting of URLs automatically. All that is required to follow one
of these links is clicking the middle mouse button on the highlighted
text.
File: w3.info, Node: Debugging HTML, Next: Native WAIS Support, Prev: Interfacing to Mail/News, Up: Advanced Features
Debugging HTML
==============
If you are feeling adventurous, or are just as anal as I am about
people writing valid HTML, you can set the variable `w3-debug-html' to
`t' and see what happens.
If Emacs-w3 thinks it has encountered invalid HTML, then a
debugging message is logged to the buffer specified by
`w3-debug-buffer'. This can be a buffer object, or the name of a
buffer.
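As a minimal sketch, the two variables above could be set from
`~/.emacs' like this. The buffer name is an arbitrary choice made for
illustration, not a default documented here.

```elisp
;; Turn on HTML validity checking (variable named in this section).
(setq w3-debug-html t)
;; Log complaints to a named buffer.  The name "*HTML Debug*" is only
;; an example; a buffer object would work as well.
(setq w3-debug-buffer "*HTML Debug*")
```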
NOTE: This has not yet been reintegrated into the new display engine
and parser.
File: w3.info, Node: Native WAIS Support, Next: Rating Links, Prev: Debugging HTML, Up: Advanced Features
Native WAIS Support
===================
This version of Emacs-w3 supports native WAIS querying (earlier
versions required the use of a gateway program). In order to use the
native WAIS support, a working "waisq" binary is required. I recommend
the distribution from think.com -
ftp://think.com/wais/wais-8-b6.1.tar.Z is a good place to start.
The variable `url-waisq-prog' must point to this executable, and one
of `url-wais-gateway-server' or `url-wais-gateway-port' should be `nil'.
When a WAIS URL is encountered, a form will be automatically
generated and displayed. After typing in your search term, the query
will be sent to the server by running the `url-waisq-prog' in a
subprocess. The results will be converted into HTML and displayed.
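A setup along these lines would probably be enough to enable the
native support. The path to the binary is an assumption; substitute
wherever `waisq' was installed on the local system.

```elisp
;; Location of the waisq executable.  /usr/local/bin/waisq is an
;; assumed path, not one taken from the WAIS distribution.
(setq url-waisq-prog "/usr/local/bin/waisq")
;; With no gateway server configured, the native WAIS support is used.
(setq url-wais-gateway-server nil)
```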
File: w3.info, Node: Rating Links, Next: Gopher Plus Support, Prev: Native WAIS Support, Up: Advanced Features
Rating Links
============
The `w3-link-delimiter-info' variable can be used to 'rate' a URL
when it shows up in an HTML page. If non-`nil', then this should be a
function, given either as a lambda list or as a symbol naming the
function. This function should expect one argument, a fully specified
URL, and should return a string. This string is inserted after the
link text.
If a user has decided that all links served from blort.com are too
laden with images, and wants to be warned that a link points at this
host, they could do something like this:
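     (defun check-url (url)
       ;; [^/]* skips any host prefix such as "www."; the dot in
       ;; blort.com is escaped so that it only matches a literal dot.
       (if (string-match "://[^/]*blort\\.com" url)
           "[SLOW!]" ""))
     (setq w3-link-delimiter-info 'check-url)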
With this in place, every link pointing to any site at blort.com
shows up as "Some link[SLOW!]" instead of just "Some link".
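The same idea extends to several rules at once. The sketch below is
one possible arrangement, not part of Emacs-w3: the names
`my-w3-link-ratings' and `my-w3-rate-link' are hypothetical, and the
patterns are examples only. It returns the label of the first matching
rule, or the empty string when no rule matches.

```elisp
;; Hypothetical table of (REGEXP . LABEL) pairs.  The hosts and labels
;; are examples only.
(defvar my-w3-link-ratings
  '(("^[a-z]+://[^/]*blort\\.com" . "[SLOW!]")
    ("^ftp://" . "[FTP]"))
  "Alist mapping URL regexps to the string shown after matching links.")

(defun my-w3-rate-link (url)
  "Return the label of the first entry in `my-w3-link-ratings' matching URL.
Return the empty string when nothing matches."
  (let ((ratings my-w3-link-ratings)
        (label ""))
    (while ratings
      (if (string-match (car (car ratings)) url)
          ;; First match wins: record the label and stop looping.
          (setq label (cdr (car ratings))
                ratings nil)
        (setq ratings (cdr ratings))))
    label))

(setq w3-link-delimiter-info 'my-w3-rate-link)
```

Because the first match wins, more specific patterns should be listed
before more general ones.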
File: w3.info, Node: Gopher Plus Support, Next: Hooks, Prev: Rating Links, Up: Advanced Features
Gopher+ Support
===============
The gopher+ support in Emacs-w3 is limited to the conversion of ASK
blocks into HTML 3.0 forms, and the usage of the content-length given by
the gopher+ server to give a nice status bar on the bottom of the
screen.
This will hopefully be extended to include the Gopher+ method of
content-type negotiation, but this may be a while.
File: w3.info, Node: Hooks, Next: Other Variables, Prev: Gopher Plus Support, Up: Advanced Features
Hooks
=====
These are the various hooks that can be used to customize some of
Emacs-w3's behavior. They are arranged roughly in the order in which
they would happen when retrieving a document. All of these are
functions (or lists of functions) that are called consecutively.
`w3-load-hooks'
These hooks are run by `w3-do-setup' the first time a URL is
fetched. All the w3 variables are initialized before this hook is
run.
`w3-file-done-hooks'
These hooks are run by `w3-prepare-buffer' after all parsing on a
document has been done. All `url-current-'* and `w3-current-'*
variables are initialized when this hook is run. This is run
before the buffer is shown, and before any inlined images are
downloaded and converted.
`w3-file-prepare-hooks'
These hooks are run by `w3-prepare-buffer' before any parsing is
done on the HTML file. The HTTP/1.0 headers specified by
`w3-show-headers' have been inserted, the syntax table has been set
to `w3-parse-args-syntax-table', and any personal annotations have
been inserted by the time this hook is run.
`w3-mode-hooks'
These hooks are run after a buffer has been parsed and displayed,
but before any inlined images are downloaded and converted.
`w3-source-file-hooks'
These hooks are run after displaying a document's source.
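As an illustration, functions can be attached to these hooks with the
standard `add-hook'. The two helper functions below are hypothetical
and do nothing but print a message. The second one assumes that the
document's buffer is current when `w3-mode-hooks' runs, which this
manual does not state.

```elisp
;; Hypothetical helper: report once that Emacs-w3 has initialized.
(defun my-w3-announce-setup ()
  (message "Emacs-w3 is set up"))

;; Hypothetical helper: report each document after it is displayed.
;; Assumes the document's buffer is current when the hook runs.
(defun my-w3-announce-document ()
  (message "Now viewing buffer %s" (buffer-name)))

(add-hook 'w3-load-hooks 'my-w3-announce-setup)
(add-hook 'w3-mode-hooks 'my-w3-announce-document)
```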
File: w3.info, Node: Other Variables, Prev: Hooks, Up: Advanced Features
Miscellaneous variables
=======================
There are lots of variables that control the real nitty-gritty of
Emacs-w3 that the beginning user probably shouldn't mess with. Here
they are.
`w3-icon-directory-list'
A list of directories to look in for the w3 standard icons. Each
entry must end in a `/'. If the directory `data-directory'/w3
exists, then this is automatically added to the default value of
http://cs.indiana.edu/elisp/w3/icons/.
`w3-keep-old-buffers'
Whether to keep old buffers around when following links. If you
do not like having lots of buffers in one Emacs session, you
should set this to `nil'. I recommend setting it to `t', so that
backtracking from one link to another is faster.
`url-passwd-entry-func'
This is a symbol indicating which function to call to read in a
password. It is set up depending on whether you are running "EFS"
or "ange-ftp" at startup if it is `nil'. This function should
accept the prompt string as its first argument, and the default
value as its second argument.
`w3-reuse-buffers'
Determines what happens when `w3-fetch' is called on a document
that has already been loaded into another buffer. Possible values
are: `nil', `yes', and `no'. `nil' will ask the user if Emacs-w3
should reuse the buffer (this is the default value). A value of
`yes' means assume the user wants to always reuse the buffer. A
value of `no' means assume the user always wants to re-fetch the
document.
`w3-show-headers'
This is a list of HTTP/1.0 headers to show at the end of a buffer.
All the headers should be in lowercase. They are inserted at the
end of the buffer in a <UL> list. Alternatively, if this is
simply `t', then all the HTTP/1.0 headers are shown. The default
value is `nil'.
`w3-show-status, url-show-status'
Whether to show progress messages in the minibuffer.
`w3-show-status' controls if messages about the parsing are
displayed, and `url-show-status' controls if a running total of the
number of bytes transferred is displayed. These can cause a large
performance hit if using a remote X display over a slow link, or a
terminal with a slow modem.
`mm-content-transfer-encodings'
An assoc list of CONTENT-TRANSFER-ENCODINGS or CONTENT-ENCODINGS
and the appropriate decoding algorithms for each. If the `cdr' of
a node is a list, then this specifies the decoder is an external
program, with the program as the first item in the list, and the
rest of the list specifying arguments to be passed on the command
line. If using an external decoder, it must accept its input from
`stdin' and send its output to `stdout'.
If the `cdr' of a node is a symbol whose function definition is
non-`nil', then that encoding can be handled internally. The
function is called with 2 arguments, buffer positions bounding the
region to be decoded. The function should completely replace that
region with the unencoded information.
Currently supported transfer encodings are: base64, x-gzip, 7bit,
8bit, binary, x-compress, x-hqx, and quoted-printable.
`url-uncompressor-alist'
An assoc list of file extensions and the appropriate uncompression
programs for each. This is used to build the Accept-encoding
header for HTTP/1.0 requests.
`url-waisq-prog'
Name of the waisq executable on this system. This should be the
`waisq' program from think.com's wais8-b5.1 distribution.
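For reference, a few of the variables above might be set from
`~/.emacs' as follows. This is only a sketch: it assumes that the
`yes' and `no' values of `w3-reuse-buffers' are symbols, and the header
names are examples rather than a recommended set.

```elisp
;; Keep old buffers so that backtracking between links is faster.
(setq w3-keep-old-buffers t)
;; Always reuse an already-loaded buffer without asking.  The value is
;; assumed here to be the symbol `yes'.
(setq w3-reuse-buffers 'yes)
;; Show two example headers, in lowercase, at the end of each buffer.
(setq w3-show-headers '("content-type" "last-modified"))
;; Silence progress messages, e.g. on a slow terminal link.
(setq w3-show-status nil
      url-show-status nil)
```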
File: w3.info, Node: More Help, Next: Future Directions, Up: Top
More Help
*********
If you need more help on Emacs-w3, please send me mail
(wmperry@spry.com). Several discussion lists have also been created
for Emacs-w3. To subscribe, send mail to majordomo@indiana.edu, with
the body of the message 'subscribe LISTNAME <YOUR EMAIL ADDRESS>'. All
other mail should go to <listname>@indiana.edu.
* w3-announce - this list is for anyone interested in Emacs-w3, and
should in general only be used by me. The gnu.emacs.sources
newsgroup and a few other mailing lists are included on this. You
may use this if you have written an enhancement to Emacs-w3 that
you wish more people to know about. (www-announce@w3.org is
included on this list).
* w3-beta - this list is for beta testers of Emacs-w3. These brave
souls test out not-quite stable code.
* w3-dev - a list consisting of myself and a few other people who are
interested in the internals of Emacs-w3, and doing active
development work. Pretty dead right now, but I hope it will grow.
If you need more help on the World Wide Web in general, please refer
to the newsgroup comp.infosystems.www. There are also several
discussion lists concerning the Web. Send mail to listserv@w3.org with
a subject line of 'subscribe <listname>'. All mail should go to
<listname>@w3.org. Administrative mail should go to www-admin@w3.org.
The lists are:
* www-talk - for general discussion of the World Wide Web, where it's
going, new features, etc. All the major developers are subscribed
to this list.
* www-announce - for announcements concerning the World Wide Web.
Server changes, new servers, new software, etc.
As a last resort, you may always mail me. I'll try to answer as
quickly as I can.
File: w3.info, Node: Future Directions, Next: Programming Interface, Prev: More Help, Up: Top
Future Directions
*****************
Changes are constantly being made to the Emacs browser (hopefully all
for the better). This is a list of the things that are being worked on
right now.
Fix before 2.3
1. Imagemap extensions (drag areas)
2. PATHs
3. TABLEs
4. MATHs
5. DSSSL and DSSSL-Lite Style sheets
Long range goals
1. Multi-DTD browsing
File: w3.info, Node: Programming Interface, Next: Generalized ZONES, Prev: Future Directions, Up: Top
Internals of Emacs-w3
*********************
This chapter attempts to explain some of the internal workings of
Emacs-w3 and various data structures that are used. It also details
some functions that are useful for using some of the Emacs-w3
functionality from within your own programs, or extending the current
capabilities of Emacs-w3.
* Menu:
* Generalized ZONES:: A generic interface to 'zones' of text
that can contain information.
* Global Variables:: Global variables used throughout Emacs-w3
* Data Structures:: The various data structures used in Emacs-w3
* Miscellaneous Functions:: Miscellaneous functions you can use to
interface with w3 and access its data
structures
* MIME functions:: MIME functions--parsing messages,
mailcap files, and more.
File: w3.info, Node: Generalized ZONES, Next: Global Variables, Prev: Programming Interface, Up: Programming Interface
Generalized ZONES
=================
Due to the many different flavors of Emacs in existence, the
addition of data and font information to arbitrary regions of text has
been generalized. The following functions are defined for
using/manipulating these "zones" of data.
`w3-add-zone (start end style data &optional highlight)'
This function creates a zone between buffer positions start and
end, with font information specified by style, and a data segment
of data. If the optional argument highlight is non-`nil', then
the region highlights when the mouse moves over it.
`w3-zone-at (point)'
Returns the zone at POINT. Preference is given to hypertext
links, then to form entry areas, then to inlined images. So if an
inlined image was part of a hypertext link, this would always
return the hypertext link.
`w3-zone-data (zone)'
Returns the zone's data segment. The data structures used in
Emacs-w3 are relatively simple. They are just list structures
that follow a certain format. The three main data types are "form
objects", "link objects", and "inlined images". All the
information for these types of links is stored as lists.
`w3-zone-hidden-p (zone)'
Returns `t' if and only if a zone is currently invisible.
`w3-hide-zone (start end)'
Makes a region of text from `start' to `end' invisible.
`w3-unhide-zone (start end)'
Makes a region of text from `start' to `end' visible again.
`w3-zone-start (zone)'
Returns an integer that is the start of zone, as a buffer
position. In Emacs 18.xx, this returns a marker instead of an
integer, but it can be used just like an integer.
`w3-zone-end (zone)'
Returns an integer that is the end of zone, as a buffer position.
In Emacs 18.xx, this returns a marker instead of an integer, but
it can be used just like an integer.
`w3-zone-eq (zone1 zone2)'
Returns `t' if and only if zone1 and zone2 represent the same
region of text in the same buffer, with the same properties and
data.
`w3-delete-zone (zone)'
Removes zone from its buffer (or current buffer). The return
value is irrelevant, and varies for each version of Emacs.
`w3-all-zones ()'
Returns a list of all the zones contained in the current buffer.
Useful for extracting information about hypertext links or form
entry areas. Programs should not rely on this list being sorted,
as the order varies with each version of Emacs.
`w3-zone-at (pt)'
This returns the zone at character position PT in the current
buffer that is either a link or a forms entry area. Returns `nil'
if there is no link at point. These data structures are what is
generally returned by `w3-zone-data'.
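The accessors above can be combined to survey a buffer. The function
below is a hypothetical sketch (the name `my-w3-zone-summary' is not
part of Emacs-w3) that uses only the zone functions as they are
described in this section.

```elisp
(defun my-w3-zone-summary ()
  "Return a list of (START END DATA) entries for the current buffer's zones.
The entries are sorted by START, since `w3-all-zones' promises no order."
  (sort (mapcar (function
                 (lambda (zone)
                   (list (w3-zone-start zone)
                         (w3-zone-end zone)
                         (w3-zone-data zone))))
                (w3-all-zones))
        ;; `<' accepts markers as well as integers, so this also works
        ;; where the accessors return markers.
        (function
         (lambda (a b) (< (car a) (car b))))))
```

Sorting on the start position gives callers a stable order to work
with, whichever version of Emacs produced the zone list.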
File: w3.info, Node: Global Variables, Next: Data Structures, Prev: Generalized ZONES, Up: Programming Interface
Global variables
================
There are also some variables that may be useful if you are writing a
program or function that interacts with Emacs-w3. All of the
`w3-current-*' variables are local to each buffer.
`url-current-mime-headers'
An assoc list of all the MIME headers for the current document.
Keyed on the lowercase MIME header (e.g., `content-type' or
`content-encoding').
`url-current-server'
Server that the current document was retrieved from.
`url-current-file'
Filename of the current document.
`url-current-type'
A string representing what network protocol was used to retrieve
the current buffer's document. Can be one of http, gopher, file,
ftp, news, or mailto.
`url-current-port'
Port # of the current document.
`w3-current-last-buffer'
The last buffer seen before this one.
`w3-running-FSF19'
This is `t' if and only if we are running in FSF Emacs 19.
`w3-running-epoch'
This is `t' if and only if we are running in Epoch 4.x.
`w3-running-xemacs'
This is `t' if and only if we are running in Lucid Emacs,
WinEmacs, or XEmacs.
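A small command built on these variables might look like the
following. The name `my-w3-describe-document' is hypothetical, and the
sketch assumes it is called from a buffer in which Emacs-w3 has already
set the `url-current-*' variables.

```elisp
(defun my-w3-describe-document ()
  "Show where the current buffer's document came from in the echo area."
  (interactive)
  ;; %s is used throughout so that the port may be a string or a number.
  (message "%s document %s from %s, port %s"
           url-current-type url-current-file
           url-current-server url-current-port))
```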